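I wanted a simple environment for checking whether the actions I take are right or wrong, so today I tried writing a simple simulator that reports basic profit and loss. I had never touched anything like this before, so I ran into quite a few walls Orz, and only at the end managed to put together a simple version.
Roast it all you want... I'll just listen -3-
There is one more reason I want an environment like this: next I will be testing different models, and it can serve as a tool for judging how good the training results are!! Without further ado, here is the code...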
import numpy as np


class stockEvn_single:
    # Class-level defaults; every instance starts with no position.
    __lot = 0           # number of shares currently held
    __mean_price = 0    # average purchase price of the held shares
    __state_assets = 0  # total assets recorded after the previous action

    def __init__(self, money=1000):
        self.__money = money
        self.__money_init = money
        self.__state_assets = self.getAssets()

    def getInfo(self):
        print('stock count : ', self.__lot, ', stock value : ', self.__mean_price, ', money leave : ', self.__money,
              ', assets', self.__state_assets)

    def doAction(self, stock_price, action, s_prop):
        '''
        stock_price : the price of stock
        action : 0, sell; 1, do nothing; 2, buy
        s_prop : a float between 0-1, the fraction of the sellable / affordable shares to trade
        Returns the change in total assets since the previous call (negative means a loss).
        '''
        return self.__calculate_price(stock_price=stock_price, action=action, s_prop=s_prop)

    def __calculate_price(self, stock_price, action, s_prop):
        if action == 0:
            can_sell = self.__lot
            sell = int(s_prop * can_sell)
            self.__doSell(stock_price=stock_price, quan=sell)
        elif action == 1:
            self.__doNothing()
        elif action == 2:
            can_buy = int(self.__money / stock_price)
            buy = int(s_prop * can_buy)
            self.__doBuy(stock_price=stock_price, quan=buy)
        # Value the portfolio at the current price and compare with the previous total.
        now_state = self.__getAssetsState(stock_price=stock_price)
        loss = now_state - self.__state_assets
        self.__state_assets = now_state
        return loss

    def __doSell(self, stock_price, quan):
        if quan < 1:
            return self.__doNothing()
        self.__lot -= quan
        self.__money += (quan * stock_price)

    def __doBuy(self, stock_price, quan):
        if quan < 1:
            return self.__doNothing()
        # Fold the newly bought shares into the average purchase price.
        value_stock = self.__lot * self.__mean_price
        self.__lot += quan
        value_stock += (quan * stock_price)
        self.__mean_price = value_stock / self.__lot
        self.__money -= (quan * stock_price)

    def __doNothing(self):
        return

    def getAssets(self):
        # Assets valued at the average purchase price.
        return self.__money + self.__lot * self.__mean_price

    def __getAssetsState(self, stock_price):
        # Assets valued at the current market price.
        return self.__lot * stock_price + self.__money

env = stockEvn_single()
env.getInfo()
get_loss = []
get_loss.append(env.doAction(stock_price=200, action=2, s_prop=0.6))
env.getInfo()
get_loss.append(env.doAction(stock_price=210, action=2, s_prop=1))
env.getInfo()
get_loss.append(env.doAction(stock_price=250, action=2, s_prop=1))
env.getInfo()
get_loss.append(env.doAction(stock_price=340, action=0, s_prop=0.6))
env.getInfo()
get_loss.append(env.doAction(stock_price=390, action=2, s_prop=1))
env.getInfo()
get_loss.append(env.doAction(stock_price=300, action=0, s_prop=0))
env.getInfo()
print('loss = ', get_loss)
stock count : 0 , stock value : 0 , money leave : 1000 , assets 1000
stock count : 3 , stock value : 200.0 , money leave : 400 , assets 1000
stock count : 4 , stock value : 202.5 , money leave : 190 , assets 1030
stock count : 4 , stock value : 202.5 , money leave : 190 , assets 1190
stock count : 2 , stock value : 202.5 , money leave : 870 , assets 1550
stock count : 4 , stock value : 296.25 , money leave : 90 , assets 1650
stock count : 4 , stock value : 296.25 , money leave : 90 , assets 1290
loss = [0, 30, 160, 360, 100, -360]
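As a quick sanity check, the loss list can be reproduced by hand from the printed states. This is only a sketch: the `steps` tuples are copied from the output above together with the prices passed to doAction, and the variable names are my own.

```python
# (shares held, cash, price) after each doAction call, taken from the run above
steps = [
    (3, 400, 200),
    (4, 190, 210),
    (4, 190, 250),
    (2, 870, 340),
    (4, 90, 390),
    (4, 90, 300),
]

initial_assets = 1000

# Total assets at each step: shares valued at the current price, plus cash.
assets = [lot * price + cash for lot, cash, price in steps]

# Each loss entry is the change in total assets since the previous step.
previous = [initial_assets] + assets[:-1]
deltas = [now - before for now, before in zip(assets, previous)]

print(assets)  # [1000, 1030, 1190, 1550, 1650, 1290]
print(deltas)  # [0, 30, 160, 360, 100, -360]
```

Note that the third step bought nothing (190 in cash cannot cover a share at 250), yet it still reports +160, because the shares already held are revalued at the new price.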
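Each call to doAction triggers one update. The loss is the current total assets (cash plus shares valued at the current price) compared with the total after the previous call, so a drop shows up as a negative value. I originally compared against the value at the time of purchase, but that was less intuitive; if training needs it later, I'll change it then XD
Next I'll try analyzing Taiwan stocks and use (an extended version of) this environment to judge whether the predictions are good or bad. But I've had so much going on lately RRRR that producing experiment results on the same day is hard: besides the lack of time, there is also the training time... the bigger the model, the longer it trains. So from here on I'll probably alternate, one day of implementation and one day of filling in things I read but didn't cover in earlier posts XD Please let me coast a little, hahaha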
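P.S. To make the two ways of scoring mentioned above concrete, here is a rough side-by-side sketch. Both helper functions are hypothetical and not part of the class; in particular, `unrealized_pnl` is only my guess at what "compare against the value at purchase" could look like, using the average purchase price.

```python
def mark_to_market_delta(lot, cash, price, previous_assets):
    """Change in total assets since the previous step (what doAction returns)."""
    return lot * price + cash - previous_assets


def unrealized_pnl(lot, mean_price, price):
    """Assumed alternative: paper profit relative to the average purchase price."""
    return lot * (price - mean_price)


# State after the last step of the run above:
# 4 shares, average purchase price 296.25, cash 90, price 300, previous total 1650.
print(mark_to_market_delta(4, 90, 300, 1650))  # -360
print(unrealized_pnl(4, 296.25, 300))          # 15.0
```

The two signals can disagree in sign: the price fell from 390 to 300, so the step-to-step change is negative, while the position is still slightly above its average cost.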